Comparing Graph Representations of Protein Structure for Mining Family-Specific Residue-Based Packing Motifs
Authors
Abstract
We find recurring amino-acid residue packing patterns, or spatial motifs, that are characteristic of protein structural families, by applying a novel frequent subgraph mining algorithm to graph representations of protein three-dimensional structure. Graph nodes represent amino acids, and edges are chosen in one of three ways: first, using a threshold for contact distance between residues; second, using Delaunay tessellation; and third, using the recently developed almost-Delaunay edges. For a set of graphs representing a protein family from the Structural Classification of Proteins (SCOP) database, subgraph mining typically identifies several hundred common subgraphs corresponding to spatial motifs that are frequently found in proteins in the family but rarely found outside of it. We find that some of the large motifs map onto known functional regions in the two protein families explored in this study, namely serine proteases and kinases. We find that graphs based on almost-Delaunay edges have significantly fewer edges than the other representations and hence offer a computational advantage, yet the patterns extracted from such graphs have a biological interpretation approximately equivalent to that of patterns extracted from distance-based graphs.
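The first of the three edge definitions, a contact-distance threshold, can be illustrated with a minimal sketch. The function name `contact_graph`, the one-point-per-residue input format, the toy coordinates, and the 8.5 Å default cutoff are all assumptions made for illustration; the abstract does not state the cutoff or the reference atom the authors used. Delaunay and almost-Delaunay edges are not shown here.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def contact_graph(residues, threshold=8.5):
    """Build distance-threshold edges between residues.

    residues:  list of (label, (x, y, z)) tuples, one point per residue
               (hypothetical input format, e.g. one representative atom).
    threshold: contact cutoff in angstroms (assumed value, not from the paper).
    Returns a set of index pairs (i, j) with i < j.
    """
    edges = set()
    for (i, (_, a)), (j, (_, b)) in combinations(enumerate(residues), 2):
        if dist(a, b) <= threshold:
            edges.add((i, j))
    return edges


# Toy coordinates, invented for illustration only.
toy = [
    ("SER", (0.0, 0.0, 0.0)),
    ("HIS", (5.0, 0.0, 0.0)),
    ("ASP", (5.0, 6.0, 0.0)),
    ("GLY", (30.0, 0.0, 0.0)),
]
edges = contact_graph(toy)
```

In this toy input the first three residues lie within the cutoff of one another and form a triangle, while the fourth is isolated. Lowering the threshold removes edges, which hints at why the choice of edge definition affects both graph size and mining cost.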
Similar resources
Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: I. Method development
Protein function prediction is one of the central problems in computational biology. We present a novel automated protein structure-based function prediction method using libraries of local residue packing patterns that are common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph where residue vertices (residue name ...
Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: II. Case studies and applications
This paper describes several case studies concerning protein function inference from its structure using our novel approach described in the accompanying paper. This approach employs family-specific motifs, i.e. three-dimensional amino acid packing patterns that are statistically prevalent within a protein family. For our case studies we have selected families from the SCOP and EC classificatio...
Distance-based identification of structure motifs in proteins using constrained frequent subgraph mining.
Structure motifs are amino acid packing patterns that occur frequently within a set of protein structures. We define a labeled graph representation of protein structure in which vertices correspond to amino acid residues and edges connect pairs of residues and are labeled by (1) the Euclidean distance between the C(alpha) atoms of the two residues and (2) a boolean indicating whether the two re...
Characterizing the regularity of tetrahedral packing motifs in protein tertiary structure
MOTIVATION: While protein secondary structure is well understood, representing the repetitive nature of tertiary packing in proteins remains difficult. We have developed a construct called the relative packing group (RPG) that applies the clique concept from graph theory as a natural basis for defining the packing motifs in proteins. An RPG is defined as a clique of residues, where every member ...
Optimized Null Model for Protein Structure Networks
Much attention has recently been given to the statistical significance of topological features observed in biological networks. Here, we consider residue interaction graphs (RIGs) as network representations of protein structures with residues as nodes and inter-residue interactions as edges. Degree-preserving randomized models have been widely used for this purpose in biomolecular networks. How...
Journal:
- Journal of computational biology : a journal of computational molecular cell biology
Volume 12, Issue 6
Pages: -
Publication date: 2005